The highlighted tokens are primarily morphemes, syllables, or word segments in Thai, Hindi, and other languages, often marking key semantic units such as time periods (e.g., "century"), proper nouns, or important grammatical structures. There is a strong emphasis on tokens that form or contribute to compound nouns, temporal expressions, and institutional or historical references, especially those denoting centuries or significant eras.

| Score Type | Accuracy | Precision | Recall | F1 score | TPR | TNR | FPR | FNR |
|---|---|---|---|---|---|---|---|---|
| detection | 0.83 | 0.851 | 0.8 | 0.825 | 0.8 | 0.86 | 0.14 | 0.2 |
| fuzz | 0.81 | 0.844 | 0.76 | 0.8 | 0.76 | 0.86 | 0.14 | 0.24 |
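
The columns in the table are not independent: given TPR and TNR, the remaining values follow from standard confusion-matrix identities. The sketch below shows one way to derive them. It assumes an evaluation set with equal numbers of positive and negative examples; the table does not state this, but the assumption reproduces the reported figures. The function name `derived_metrics` and the parameter `pos_fraction` are hypothetical and chosen for illustration only.

```python
def derived_metrics(tpr: float, tnr: float, pos_fraction: float = 0.5) -> dict:
    """Derive the remaining table columns from TPR and TNR.

    pos_fraction is the share of positive examples in the evaluation set.
    The default of 0.5 (balanced classes) is an assumption, not something
    stated in the source table.
    """
    fpr = 1.0 - tnr  # false positive rate
    fnr = 1.0 - tpr  # false negative rate

    # Confusion-matrix cells expressed as fractions of the whole set.
    tp = tpr * pos_fraction
    fp = fpr * (1.0 - pos_fraction)
    tn = tnr * (1.0 - pos_fraction)

    precision = tp / (tp + fp)
    recall = tpr  # recall is the true positive rate by definition
    f1 = 2.0 * precision * recall / (precision + recall)
    accuracy = tp + tn

    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f1": f1,
        "tpr": tpr,
        "tnr": tnr,
        "fpr": fpr,
        "fnr": fnr,
    }


if __name__ == "__main__":
    # TPR and TNR values taken from the table rows.
    for name, tpr, tnr in [("detection", 0.8, 0.86), ("fuzz", 0.76, 0.86)]:
        metrics = derived_metrics(tpr, tnr)
        print(name, {key: round(value, 3) for key, value in metrics.items()})
```

Under the balanced-class assumption, accuracy reduces to the mean of TPR and TNR, which is why both rows share a TNR of 0.86 yet differ in accuracy by exactly half the gap in TPR.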